Explicit length modelling for statistical machine translation
Similar resources
Randomised Language Modelling for Statistical Machine Translation
A Bloom filter (BF) is a randomised data structure for set membership queries. Its space requirements are significantly below lossless information-theoretic lower bounds but it produces false positives with some quantifiable probability. Here we explore the use of BFs for language modelling in statistical machine translation. We show how a BF containing n-grams can enable us to use much larger ...
Dialogue Modelling for Statistical Machine Translation
The proposed project sets out to improve the quality of machine translation (MT) technology. Machine translation, known by the general public through popular applications such as Google Translate, is defined as the automatic translation from one language to another through a computer algorithm – for instance, translating a text from Japanese to Norwegian or vice-versa. In a globalised world whe...
Modelling pronominal anaphora in statistical machine translation
Current Statistical Machine Translation (SMT) systems translate texts sentence by sentence without considering any cross-sentential context. Assuming independence between sentences makes it difficult to take certain translation decisions when the necessary information cannot be determined locally. We argue for the necessity to include cross-sentence dependencies in SMT. As a case in point, we st...
Fixed Length Word Suffix for Factored Statistical Machine Translation
Factored Statistical Machine Translation extends the Phrase Based SMT model by allowing each word to be a vector of factors. Experiments have shown effectiveness of many factors, including the Part of Speech tags in improving the grammaticality of the output. However, high quality part of speech taggers are not available in open domain for many languages. In this paper we used fixed length word...
Improving Pronoun Translation for Statistical Machine Translation
Machine Translation is a well-established field, yet the majority of current systems translate sentences in isolation, losing valuable contextual information from previously translated sentences in the discourse. One important type of contextual information concerns who or what a coreferring pronoun corefers to (i.e., its antecedent). Languages differ significantly in how they achieve coreferen...
Journal
Journal title: Pattern Recognition
Year: 2012
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2012.01.006